MIREX-2012 “Audio Key Detection” Task: ircamkeymode
Abstract
This extended abstract details a submission to the Music Information Retrieval Evaluation eXchange (MIREX) 2012 for the “Audio Key Detection” task. The system, named ircamkeymode, performs key (C, Db, D, E, ...) and mode (Major, minor) detection. It is a simplified version of the systems described in [6] [5]. We briefly summarize it below.

1. OVERVIEW OF THE MODEL

1.1 Chromagram extraction: The signal is first converted to mono and down-sampled to 11.025 kHz. At each frame, the DFT of the signal is computed using a Blackman analysis window of length L = 0.3715 s with a hop size of L/2. After normalization by its maximum value, the amplitude DFT is converted to a Sone scale. The computation of the sone-converted values is similar to the one used in [4]. Thresholding (values below 1% of the maximum are discarded) and peak-picking are then applied. A 36-bin chroma representation (3 bins per semitone) [7] [1] is then computed. Only frequencies between 100 and 2000 Hz are considered. The shape of the chroma filters is chosen as a hyperbolic tangent with 50% overlap. Each of the 36 chroma channels is then smoothed over time using median filtering.

1.2 Key/Mode templates creation: We use an approach similar to Gomez [2]: the key profiles are created by extending the Krumhansl & Schmuckler (Temperley or Diatonic) pitch distribution profile to the polyphonic (several pitches) and audio (several harmonics for each pitch) cases. For each key, we consider the three main triads in this key: the tonic, dominant and subdominant triads (for example, in C Major: C-E-G, G-B-D, F-A-C). The chroma vector corresponding to each single note of a specific triad is computed by adding the contributions of its harmonics h. Harmonic h is given a contribution of $0.6^{h-1}$. Only the first 4 harmonics are considered. For a specific triad, the chroma vectors corresponding to the three notes are added.
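The note and triad chroma vectors of Section 1.2 can be illustrated with a minimal sketch. It is not the ircamkeymode code: the function names `note_chroma` and `triad_chroma` are hypothetical, a 12-bin chroma is used for brevity, and each harmonic is assumed to fall on the nearest semitone. Only the 0.6^(h-1) weighting and the limit of 4 harmonics come from the description above.

```python
import math

def note_chroma(pitch_class, n_harmonics=4, decay=0.6):
    """12-bin chroma vector of one note, adding its first harmonics.

    pitch_class: 0 = C, 1 = C#/Db, ..., 11 = B.
    Harmonic h contributes decay ** (h - 1), i.e. 0.6^(h-1) by default.
    """
    chroma = [0.0] * 12
    for h in range(1, n_harmonics + 1):
        # Harmonic h lies 12*log2(h) semitones above the fundamental.
        # Rounding to the nearest semitone is an assumption of this sketch.
        offset = round(12 * math.log2(h))
        chroma[(pitch_class + offset) % 12] += decay ** (h - 1)
    return chroma

def triad_chroma(pitch_classes):
    """Chroma vector of a triad: the sum of its three note chroma vectors."""
    total = [0.0] * 12
    for pc in pitch_classes:
        for k, value in enumerate(note_chroma(pc)):
            total[k] += value
    return total

# Tonic triad of C Major (C-E-G) as pitch classes 0, 4, 7.
c_major_tonic = triad_chroma([0, 4, 7])
```

With these assumptions, harmonics 1, 2 and 4 of a note land on its own pitch class and harmonic 3 lands a fifth above, so the G bin of the C-E-G triad collects energy from both the note G and the third harmonic of C.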
Finally, for a specific key, the key-chroma vector is computed by adding the three triad-chroma vectors. Each triad-chroma vector is weighted by the value of Krumhansl’s (Temperley or Diatonic) profile at the position of the root of the triad in the key (for example, 6 for the F-A-C triad in C Major). The result is a 12-dimensional chroma profile vector for each of the 24 keys: $C_i$, $i \in [1, 24]$.

1.3 Key/Mode decision: The most likely key/mode of the track is estimated using an approach similar to Izmirli [3]. The chroma vectors c(t) are extracted on a frame basis. At each time t, we estimate the key $C_i$ that has the highest correlation (we use the cosine distance) with a cumulated-over-time chroma vector. We attribute to this key a score proportional to the distance between its correlation value and the correlation value of the second most likely key. This score acts as a reliability coefficient. The final key decision is the key with the maximum score cumulated over time. Only the first 20 seconds of each track are considered.

This document is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 License. http://creativecommons.org/licenses/by-nc-sa/3.0/ © 2011 The Authors.

2. FLOWCHART OF THE MODEL
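The decision stage of Section 1.3 can be sketched as follows. This is an illustration under stated assumptions, not the submitted system: `decide_key` and `cosine_similarity` are hypothetical names, the cumulated chroma vector is assumed to be a plain running sum of the frames, and the score increment is taken to be exactly the gap between the best and second-best correlation (the text only says it is proportional to it).

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if one is null)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0

def decide_key(frames, templates):
    """Pick the key whose reliability score, cumulated over time, is largest.

    frames: chroma vectors c(t) in time order (the caller is expected to
            pass only the frames covering the first 20 seconds).
    templates: dict mapping a key name to its chroma profile vector;
               at least two entries are required.
    """
    cumulated = [0.0] * len(next(iter(templates.values())))
    scores = {name: 0.0 for name in templates}
    for frame in frames:
        # Assumption: "cumulated over time" is a simple running sum.
        cumulated = [c + f for c, f in zip(cumulated, frame)]
        ranked = sorted(
            ((cosine_similarity(cumulated, profile), name)
             for name, profile in templates.items()),
            reverse=True,
        )
        (best_corr, best_name), (second_corr, _) = ranked[0], ranked[1]
        # Reliability: how far the best key is ahead of the runner-up.
        scores[best_name] += best_corr - second_corr
    return max(scores, key=scores.get)

# Toy example with two hand-made profiles (not the real key templates).
toy_templates = {
    "C Major": [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
    "A minor": [1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0],
}
toy_frames = [[1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]] * 3
estimated_key = decide_key(toy_frames, toy_templates)
```

Weighting each frame's vote by the margin over the runner-up means that ambiguous frames, where two keys correlate almost equally well, contribute little to the final decision.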
Similar resources
MIREX-2013 “Audio Key Detection” Task: ircamkeymode
This extended abstract details a submission to the Music Information Retrieval Evaluation eXchange (MIREX) 2013 for the “Audio Key Detection” task. The system named ircamkeymode performs key (C, Db, D, E, ...) and mode (Major, minor) detection. The system is a simplified version of the systems described in [6] [5]. We briefly summarize it below. 1. OVERVIEW OF THE MODEL 1.1 Chromagram extracti...
MIREX-2010 “Audio Key Detection” Task: ircamkeymode
This extended abstract details a submission to the Music Information Retrieval Evaluation eXchange (MIREX) 2010 for the “Audio Key Detection” task. The system named ircamkeymode performs key (C, Db, D, E, ...) and mode (Major, minor) detection. The system is a simplified version of the systems described in [1] [2]. We briefly summarize it below. 1. OVERVIEW OF THE MODEL 1.1 Chromagram extracti...
MIREX-2011 “Audio Key Detection” Task: ircamkeymode
This extended abstract details a submission to the Music Information Retrieval Evaluation eXchange (MIREX) 2011 for the “Audio Key Detection” task. The system named ircamkeymode performs key (C, Db, D, E, ...) and mode (Major, minor) detection. The system is a simplified version of the systems described in [6] [5]. We briefly summarize it below. 1. OVERVIEW OF THE MODEL 1.1 Chromagram extracti...
MIREX 2011: Audio Key Detection System with Statistical Key Profiles
This extended abstract details a submission to the Music Information Retrieval Evaluation eXchange (MIREX) 2011 for the Audio Key Detection task. The algorithm named “cbmirex2011 pdfs modes” is a novel method that performs the key and mode detection. The main innovation is the use of two sets of probability density functions (PDFs), computed from the experimental data, to generate the statistical ...
MIREX 2010: Key Recognition with Zweiklang Profiles
This extended abstract details a submission to the Music Information Retrieval Evaluation eXchange (MIREX) 2012 for the Audio Key Detection task. The system performs key (C, Db, D, E, ...) and mode (major, minor) detection. The system uses a new algorithm called Zweiklang-Profiling. A zweiklang is the combination of two pitches, which we determine by detecting the two most prominent chroma valu...